Dataset statistics
| Number of variables | 9 |
|---|---|
| Number of observations | 100000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 6.9 MiB |
| Average record size in memory | 72.0 B |
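The memory figures above can be cross-checked with simple arithmetic. The sketch below assumes every column stores 8 bytes per value (as float64 and datetime64 columns would) and that the index adds a fixed 128 bytes; neither assumption is stated in the report, but together they reproduce the reported sizes.

```python
# Assumptions (not stated in the report): 8 bytes per stored value and a
# 128-byte index overhead that is counted once in the total and once in
# each per-column "Memory size" figure.
N_ROWS = 100_000
N_COLUMNS = 9
BYTES_PER_VALUE = 8
INDEX_OVERHEAD = 128

column_bytes = N_ROWS * BYTES_PER_VALUE
total_bytes = N_COLUMNS * column_bytes + INDEX_OVERHEAD

per_column_kib = (column_bytes + INDEX_OVERHEAD) / 1024
total_mib = total_bytes / 1024**2
avg_record_bytes = total_bytes / N_ROWS

print(f"per-column memory size: {per_column_kib:.1f} KiB")
print(f"total size in memory:   {total_mib:.1f} MiB")
print(f"average record size:    {avg_record_bytes:.1f} B")
```

Under these assumptions the three printed values agree with the 781.4 KiB, 6.9 MiB and 72.0 B shown in the report.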
Variable types
| DateTime | 1 |
|---|---|
| Numeric | 8 |
Alerts
| Alert | Type |
|---|---|
| SYM/H_INDEX_nT is highly correlated with 1-M_AE_nT and 2 other fields | High correlation |
| 1-M_AE_nT is highly correlated with SYM/H_INDEX_nT and 1 other field | High correlation |
| 400kmDensity is highly correlated with SYM/H_INDEX_nT and 5 other fields | High correlation |
| DAILY_SUNSPOT_NO_ is highly correlated with 400kmDensity and 2 other fields | High correlation |
| DAILY_F10.7_ is highly correlated with 400kmDensity and 2 other fields | High correlation |
| 3-H_KP*10_ is highly correlated with SYM/H_INDEX_nT and 2 other fields | High correlation |
| irradiance (W/m^2/nm) is highly correlated with 400kmDensity and 2 other fields | High correlation |
| d_diff is highly correlated with 400kmDensity | High correlation |
| Datetime has unique values | Unique |
| SYM/H_INDEX_nT has 2746 (2.7%) zeros | Zeros |
| DAILY_SUNSPOT_NO_ has 24338 (24.3%) zeros | Zeros |
| 3-H_KP*10_ has 8469 (8.5%) zeros | Zeros |
| d_diff has 1572 (1.6%) zeros | Zeros |
Reproduction
| Analysis started | 2022-11-17 20:34:12.314495 |
|---|---|
| Analysis finished | 2022-11-17 20:34:32.393985 |
| Duration | 20.08 seconds |
| Software version | pandas-profiling v3.4.0 |
| Download configuration | config.json |
Datetime
| Distinct | 100000 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 781.4 KiB |
| Minimum | 2002-08-01 00:32:00 |
| Maximum | 2012-06-30 23:54:00 |
Histogram with fixed size bins (bins=50)
SYM/H_INDEX_nT
| Distinct | 320 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -11.42478 |
| Minimum | -451 |
| Maximum | 101 |
| Zeros | 2746 |
| Zeros (%) | 2.7% |
| Negative | 76463 |
| Negative (%) | 76.5% |
| Memory size | 781.4 KiB |
Quantile statistics
| Minimum | -451 |
|---|---|
| 5-th percentile | -41 |
| Q1 | -18 |
| median | -8 |
| Q3 | -1 |
| 95-th percentile | 9 |
| Maximum | 101 |
| Range | 552 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 19.40470065 |
|---|---|
| Coefficient of variation (CV) | -1.698474776 |
| Kurtosis | 58.11383626 |
| Mean | -11.42478 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | -4.641146228 |
| Sum | -1142478 |
| Variance | 376.5424074 |
| Monotonicity | Not monotonic |
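Several of the statistics above are derived from one another, which makes them easy to sanity-check. The sketch below recomputes the range, IQR, coefficient of variation, variance and sum from the reported minimum, maximum, quartiles, mean and standard deviation; the variable names are illustrative only.

```python
# Values copied from the tables above.
n_observations = 100_000
mean = -11.42478
std = 19.40470065
minimum, q1, q3, maximum = -451, -18, -1, 101

value_range = maximum - minimum   # spread between the extremes
iqr = q3 - q1                     # spread of the middle 50% of values
cv = std / mean                   # negative here only because the mean is negative
variance = std ** 2
total = mean * n_observations

print(value_range, iqr)
print(round(cv, 6), round(variance, 4), round(total))
```

The negative coefficient of variation is an artifact of the negative mean rather than a sign of negative dispersion, so it is usually read by magnitude only.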
Histogram with fixed size bins (bins=50)
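A fixed-size-bin histogram splits the interval between the minimum and the maximum into equal-width bins and counts the values in each. The following is a minimal sketch of that idea, not the report generator's own implementation; the function name and the toy input (the quantile values listed above) are chosen for illustration.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width bins spanning [min, max]."""
    lo, hi = min(values), max(values)
    edges = [lo + i * (hi - lo) / bins for i in range(bins + 1)]
    counts = [0] * bins
    if hi == lo:
        # All values are identical: put everything in the first bin.
        counts[0] = len(values)
        return edges, counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum sits on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return edges, counts


sample = [-451, -18, -8, -1, 9, 101]
edges, counts = fixed_width_histogram(sample, bins=4)
print(edges)
print(counts)
```

With a long left tail like this one, most values land in the last bin, which is why heavily skewed variables produce histograms dominated by a single bar.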
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| -3 | 3565 | 3.6% |
| -2 | 3484 | 3.5% |
| -5 | 3479 | 3.5% |
| -7 | 3454 | 3.5% |
| -8 | 3406 | 3.4% |
| -4 | 3377 | 3.4% |
| -6 | 3305 | 3.3% |
| -1 | 3289 | 3.3% |
| -9 | 3268 | 3.3% |
| -10 | 3048 | 3.0% |
| Other values (310) | 66325 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -451 | 1 | |
| -446 | 2 | |
| -420 | 1 | |
| -398 | 1 | |
| -392 | 1 | |
| -377 | 1 | |
| -376 | 1 | |
| -374 | 1 | |
| -371 | 1 | |
| -368 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 101 | 1 | |
| 86 | 1 | |
| 79 | 1 | |
| 78 | 1 | |
| 76 | 1 | |
| 75 | 1 | |
| 71 | 1 | |
| 65 | 1 | |
| 63 | 1 | |
| 61 | 1 |
1-M_AE_nT
| Distinct | 1430 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 175.03421 |
| Minimum | 1 |
| Maximum | 3415 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 781.4 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 15 |
| Q1 | 38 |
| median | 87 |
| Q3 | 233 |
| 95-th percentile | 616 |
| Maximum | 3415 |
| Range | 3414 |
| Interquartile range (IQR) | 195 |
Descriptive statistics
| Standard deviation | 212.5372747 |
|---|---|
| Coefficient of variation (CV) | 1.214261342 |
| Kurtosis | 9.388712433 |
| Mean | 175.03421 |
| Median Absolute Deviation (MAD) | 61 |
| Skewness | 2.465619817 |
| Sum | 17503421 |
| Variance | 45172.09312 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 32 | 936 | 0.9% |
| 38 | 933 | 0.9% |
| 33 | 930 | 0.9% |
| 27 | 925 | 0.9% |
| 30 | 919 | 0.9% |
| 29 | 914 | 0.9% |
| 35 | 905 | 0.9% |
| 25 | 904 | 0.9% |
| 26 | 895 | 0.9% |
| 36 | 890 | 0.9% |
| Other values (1420) | 90849 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 2 | < 0.1% |
| 2 | 19 | < 0.1% |
| 3 | 61 | 0.1% |
| 4 | 131 | 0.1% |
| 5 | 185 | 0.2% |
| 6 | 284 | |
| 7 | 327 | |
| 8 | 375 | |
| 9 | 472 | |
| 10 | 506 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 3415 | 1 | |
| 3338 | 1 | |
| 2847 | 1 | |
| 2819 | 1 | |
| 2747 | 1 | |
| 2467 | 1 | |
| 2357 | 1 | |
| 2304 | 1 | |
| 2284 | 1 | |
| 2280 | 1 |
400kmDensity
| Distinct | 99375 |
|---|---|
| Distinct (%) | 99.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.488857604 × 10-12 |
| Minimum | 1.004137 × 10-15 |
| Maximum | 2.409958 × 10-11 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 781.4 KiB |
Quantile statistics
| Minimum | 1.004137 × 10-15 |
|---|---|
| 5-th percentile | 2.0491354 × 10-13 |
| Q1 | 4.98625825 × 10-13 |
| median | 9.643517 × 10-13 |
| Q3 | 1.93218925 × 10-12 |
| 95-th percentile | 4.63733015 × 10-12 |
| Maximum | 2.409958 × 10-11 |
| Range | 2.409857586 × 10-11 |
| Interquartile range (IQR) | 1.433563425 × 10-12 |
Descriptive statistics
| Standard deviation | 1.479053648 × 10-12 |
|---|---|
| Coefficient of variation (CV) | 0.9934151151 |
| Kurtosis | 0 |
| Mean | 1.488857604 × 10-12 |
| Median Absolute Deviation (MAD) | 5.7618255 × 10-13 |
| Skewness | 0 |
| Sum | 1.488857604 × 10-7 |
| Variance | 2.187599694 × 10-24 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.063028 × 10-12 | 3 | < 0.1% |
| 2.148843 × 10-12 | 3 | < 0.1% |
| 1.013015 × 10-12 | 3 | < 0.1% |
| 1.374756 × 10-12 | 3 | < 0.1% |
| 1.800618 × 10-12 | 2 | < 0.1% |
| 1.387199 × 10-12 | 2 | < 0.1% |
| 1.722024 × 10-12 | 2 | < 0.1% |
| 1.444553 × 10-12 | 2 | < 0.1% |
| 1.136125 × 10-13 | 2 | < 0.1% |
| 1.448267 × 10-12 | 2 | < 0.1% |
| Other values (99365) | 99976 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1.004137 × 10-15 | 1 | |
| 1.739733 × 10-15 | 1 | |
| 3.272275 × 10-15 | 1 | |
| 4.086779 × 10-15 | 1 | |
| 4.178992 × 10-15 | 1 | |
| 4.24821 × 10-15 | 1 | |
| 4.338149 × 10-15 | 1 | |
| 4.40762 × 10-15 | 1 | |
| 5.317355 × 10-15 | 1 | |
| 6.15633 × 10-15 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 2.409958 × 10-11 | 1 | |
| 1.981068 × 10-11 | 1 | |
| 1.765326 × 10-11 | 1 | |
| 1.68191 × 10-11 | 1 | |
| 1.671561 × 10-11 | 1 | |
| 1.534461 × 10-11 | 1 | |
| 1.497177 × 10-11 | 1 | |
| 1.4876 × 10-11 | 1 | |
| 1.459356 × 10-11 | 1 | |
| 1.392578 × 10-11 | 1 |
DAILY_SUNSPOT_NO_
| Distinct | 214 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 47.67445 |
| Minimum | 0 |
| Maximum | 281 |
| Zeros | 24338 |
| Zeros (%) | 24.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 781.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 8 |
| median | 31 |
| Q3 | 73 |
| 95-th percentile | 149 |
| Maximum | 281 |
| Range | 281 |
| Interquartile range (IQR) | 65 |
Descriptive statistics
| Standard deviation | 50.63987748 |
|---|---|
| Coefficient of variation (CV) | 1.062201609 |
| Kurtosis | 1.428870838 |
| Mean | 47.67445 |
| Median Absolute Deviation (MAD) | 31 |
| Skewness | 1.295382476 |
| Sum | 4767445 |
| Variance | 2564.397191 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 24338 | |
| 13 | 2478 | 2.5% |
| 12 | 2376 | 2.4% |
| 15 | 1929 | 1.9% |
| 14 | 1821 | 1.8% |
| 18 | 1405 | 1.4% |
| 26 | 1358 | 1.4% |
| 16 | 1351 | 1.4% |
| 11 | 1242 | 1.2% |
| 23 | 1039 | 1.0% |
| Other values (204) | 60663 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 24338 | |
| 5 | 57 | 0.1% |
| 6 | 138 | 0.1% |
| 7 | 318 | 0.3% |
| 8 | 256 | 0.3% |
| 9 | 493 | 0.5% |
| 10 | 799 | 0.8% |
| 11 | 1242 | 1.2% |
| 12 | 2376 | 2.4% |
| 13 | 2478 | 2.5% |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 281 | 27 | < 0.1% |
| 279 | 28 | < 0.1% |
| 270 | 39 | |
| 267 | 30 | |
| 263 | 37 | |
| 252 | 40 | |
| 250 | 74 | |
| 248 | 59 | |
| 247 | 55 | |
| 239 | 30 |
DAILY_F10.7_
| Distinct | 927 |
|---|---|
| Distinct (%) | 0.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 97.696717 |
| Minimum | 65.1 |
| Maximum | 999.9 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 781.4 KiB |
Quantile statistics
| Minimum | 65.1 |
|---|---|
| 5-th percentile | 67.5 |
| Q1 | 71.5 |
| median | 85.2 |
| Q3 | 111.2 |
| 95-th percentile | 158.6 |
| Maximum | 999.9 |
| Range | 934.8 |
| Interquartile range (IQR) | 39.7 |
Descriptive statistics
| Standard deviation | 54.35243319 |
|---|---|
| Coefficient of variation (CV) | 0.5563383792 |
| Kurtosis | 184.3572119 |
| Mean | 97.696717 |
| Median Absolute Deviation (MAD) | 15.6 |
| Skewness | 11.50422375 |
| Sum | 9769671.7 |
| Variance | 2954.186994 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 69.3 | 834 | 0.8% |
| 68 | 743 | 0.7% |
| 69.8 | 714 | 0.7% |
| 68.8 | 705 | 0.7% |
| 69.5 | 680 | 0.7% |
| 68.2 | 641 | 0.6% |
| 67.4 | 639 | 0.6% |
| 70.3 | 634 | 0.6% |
| 70.2 | 631 | 0.6% |
| 68.5 | 618 | 0.6% |
| Other values (917) | 93161 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 65.1 | 25 | < 0.1% |
| 65.2 | 37 | < 0.1% |
| 65.5 | 38 | < 0.1% |
| 65.6 | 33 | < 0.1% |
| 65.8 | 67 | 0.1% |
| 65.9 | 77 | 0.1% |
| 66 | 92 | 0.1% |
| 66.1 | 103 | 0.1% |
| 66.2 | 258 | |
| 66.3 | 220 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 999.9 | 246 | |
| 275.4 | 33 | < 0.1% |
| 270.9 | 31 | < 0.1% |
| 267.6 | 41 | < 0.1% |
| 254 | 30 | < 0.1% |
| 246.9 | 39 | < 0.1% |
| 245.2 | 30 | < 0.1% |
| 242.6 | 34 | < 0.1% |
| 240.6 | 30 | < 0.1% |
| 232.8 | 35 | < 0.1% |
3-H_KP*10_
| Distinct | 28 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 18.04697 |
| Minimum | 0 |
| Maximum | 90 |
| Zeros | 8469 |
| Zeros (%) | 8.5% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 781.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 7 |
| median | 17 |
| Q3 | 27 |
| 95-th percentile | 43 |
| Maximum | 90 |
| Range | 90 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 14.17234393 |
|---|---|
| Coefficient of variation (CV) | 0.7853032352 |
| Kurtosis | 0.7967219941 |
| Mean | 18.04697 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.9143439097 |
| Sum | 1804697 |
| Variance | 200.8553324 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=28)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 11361 | |
| 7 | 11179 | |
| 10 | 9455 | |
| 13 | 8566 | |
| 0 | 8469 | |
| 17 | 8157 | |
| 20 | 7392 | |
| 23 | 6501 | 6.5% |
| 27 | 6071 | 6.1% |
| 30 | 5329 | 5.3% |
| Other values (18) | 17520 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 8469 | |
| 3 | 11361 | |
| 7 | 11179 | |
| 10 | 9455 | |
| 13 | 8566 | |
| 17 | 8157 | |
| 20 | 7392 | |
| 23 | 6501 | |
| 27 | 6071 | |
| 30 | 5329 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 90 | 13 | < 0.1% |
| 87 | 43 | < 0.1% |
| 83 | 50 | 0.1% |
| 80 | 21 | < 0.1% |
| 77 | 71 | 0.1% |
| 73 | 113 | 0.1% |
| 70 | 92 | 0.1% |
| 67 | 104 | 0.1% |
| 63 | 174 | |
| 60 | 296 |
irradiance (W/m^2/nm)
| Distinct | 3231 |
|---|---|
| Distinct (%) | 3.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.005515761088 |
| Minimum | 0.004873058293 |
| Maximum | 0.007349349558 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 781.4 KiB |
Quantile statistics
| Minimum | 0.004873058293 |
|---|---|
| 5-th percentile | 0.004923261702 |
| Q1 | 0.005049338564 |
| median | 0.005327608902 |
| Q3 | 0.005854960997 |
| 95-th percentile | 0.006626430433 |
| Maximum | 0.007349349558 |
| Range | 0.002476291265 |
| Interquartile range (IQR) | 0.0008056224324 |
Descriptive statistics
| Standard deviation | 0.0005479276685 |
|---|---|
| Coefficient of variation (CV) | 0.09933854272 |
| Kurtosis | 0.1077951908 |
| Mean | 0.005515761088 |
| Median Absolute Deviation (MAD) | 0.0003593955189 |
| Skewness | 0.9265807933 |
| Sum | 551.5761088 |
| Variance | 3.002247299 × 10-7 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.00495163817 | 108 | 0.1% |
| 0.00491212029 | 94 | 0.1% |
| 0.004925216082 | 85 | 0.1% |
| 0.004914534744 | 82 | 0.1% |
| 0.004895772319 | 76 | 0.1% |
| 0.004930291791 | 74 | 0.1% |
| 0.00498051988 | 72 | 0.1% |
| 0.00519876834 | 72 | 0.1% |
| 0.004969654139 | 72 | 0.1% |
| 0.005658049602 | 71 | 0.1% |
| Other values (3221) | 99194 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.004873058293 | 27 | |
| 0.004877128173 | 29 | |
| 0.004877185915 | 28 | |
| 0.004877588246 | 30 | |
| 0.004881324712 | 15 | |
| 0.004881698173 | 15 | |
| 0.004881755915 | 37 | |
| 0.00488556223 | 35 | |
| 0.004885710776 | 29 | |
| 0.004885739647 | 22 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.007349349558 | 32 | |
| 0.00734248152 | 36 | |
| 0.007334709167 | 35 | |
| 0.007301890757 | 35 | |
| 0.007268224377 | 29 | |
| 0.007266042288 | 33 | |
| 0.007259562146 | 39 | |
| 0.007257604506 | 36 | |
| 0.007247306872 | 19 | |
| 0.007218547165 | 23 |
d_diff
| Distinct | 96495 |
|---|---|
| Distinct (%) | 96.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | -5.753678404 × 10-16 |
| Minimum | -1.1317823 × 10-11 |
| Maximum | 7.0179446 × 10-12 |
| Zeros | 1572 |
| Zeros (%) | 1.6% |
| Negative | 48068 |
| Negative (%) | 48.1% |
| Memory size | 781.4 KiB |
Quantile statistics
| Minimum | -1.1317823 × 10-11 |
|---|---|
| 5-th percentile | -1.669487 × 10-13 |
| Q1 | -3.603625 × 10-14 |
| median | 4.405 × 10-16 |
| Q3 | 3.793225 × 10-14 |
| 95-th percentile | 1.6105214 × 10-13 |
| Maximum | 7.0179446 × 10-12 |
| Range | 1.83357676 × 10-11 |
| Interquartile range (IQR) | 7.39685 × 10-14 |
Descriptive statistics
| Standard deviation | 1.788330009 × 10-13 |
|---|---|
| Coefficient of variation (CV) | -310.8150791 |
| Kurtosis | 0 |
| Mean | -5.753678404 × 10-16 |
| Median Absolute Deviation (MAD) | 3.69934 × 10-14 |
| Skewness | 0 |
| Sum | -5.753678404 × 10-11 |
| Variance | 3.19812422 × 10-26 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 1572 | 1.6% |
| -3.9281 × 10-14 | 4 | < 0.1% |
| 3.7324 × 10-14 | 4 | < 0.1% |
| 4.4943 × 10-14 | 4 | < 0.1% |
| 4.2494 × 10-14 | 3 | < 0.1% |
| -1.039 × 10-15 | 3 | < 0.1% |
| -6.155 × 10-15 | 3 | < 0.1% |
| 2.2696 × 10-14 | 3 | < 0.1% |
| -7.928 × 10-15 | 3 | < 0.1% |
| 5.02 × 10-16 | 3 | < 0.1% |
| Other values (96485) | 98398 |
Minimum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| -1.1317823 × 10-11 | 1 | |
| -6.4132083 × 10-12 | 1 | |
| -6.3550716 × 10-12 | 1 | |
| -5.724647 × 10-12 | 1 | |
| -5.393795 × 10-12 | 1 | |
| -5.2722229 × 10-12 | 1 | |
| -4.9769291 × 10-12 | 1 | |
| -4.695843 × 10-12 | 1 | |
| -4.645975 × 10-12 | 1 | |
| -4.521292 × 10-12 | 1 |
Maximum 10 values
| Value | Count | Frequency (%) |
|---|---|---|
| 7.0179446 × 10-12 | 1 | |
| 5.45498 × 10-12 | 1 | |
| 3.871182 × 10-12 | 1 | |
| 3.78975651 × 10-12 | 1 | |
| 3.6282064 × 10-12 | 1 | |
| 3.5397972 × 10-12 | 1 | |
| 3.317901 × 10-12 | 1 | |
| 3.1141065 × 10-12 | 1 | |
| 3.07818 × 10-12 | 1 | |
| 3.059079 × 10-12 | 1 |
Auto
The auto setting is an easily interpretable pairwise column metric that selects a method by variable-type pair: categorical-categorical uses Cramér's V, numerical-categorical uses Cramér's V (with the numerical column discretized), and numerical-numerical uses Spearman's ρ. This configuration picks the most suitable method for each pair of columns.
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
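The description above can be sketched in plain Python: rank both variables (averaging the ranks of ties), then apply the covariance-over-standard-deviations formula to the ranks. This is an illustrative sketch rather than the code the report generator runs; `average_ranks`, `pearson` and `spearman` are names chosen here.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman's rho: Pearson's r computed on the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
print(spearman(x, y), pearson(x, y))
```

Because `y` increases whenever `x` does, ρ is 1 (up to floating-point rounding) while Pearson's r on the raw values stays below 1.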
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the slope of the line (its angle to the x-axis) does not affect r, apart from its sign. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
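As a small illustration of the formula and of the invariance property, the sketch below computes r for a toy sample and again after shifting and rescaling both variables by positive factors. The function name and the data are made up for the example.

```python
import math


def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # The 1/n factors in the covariance and standard deviations cancel out.
    return cov / (spread_x * spread_y)


x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson(x, y)

# Separate changes of location and (positive) scale leave r unchanged.
x_shifted = [2 * a + 3 for a in x]
y_shifted = [0.5 * b - 7 for b in y]
r_shifted = pearson(x_shifted, y_shifted)

print(r, r_shifted)
```

Both calls return the same value up to floating-point rounding; a negative scale factor on one variable would flip the sign of r but not its magnitude.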
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
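The pair-counting definition translates directly into code. The sketch below implements it exactly as worded, with no adjustment for ties (this form is often called τ-a); it is an assumption that the report's software may use a tie-adjusted variant instead, so values can differ slightly when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total pairs, with no tie correction."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
        # direction == 0 means a tie in x or y; it counts toward neither.
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

Here five of the six pairs are concordant and one is discordant, giving τ = 4/6. The loop is quadratic in the number of observations, so for 100000 rows a library routine would be the practical choice.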
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
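Both missing-value views reduce to a boolean "is this cell present?" grid. The sketch below builds such a grid and the per-column counts for a tiny made-up table; the helper names and the sample rows are hypothetical, and the dataset profiled here has no missing cells, so its own grid would be entirely filled.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def nullity_matrix(rows, columns):
    """One list of booleans per row: True where the cell holds a value."""
    return [[not is_missing(row.get(col)) for col in columns] for row in rows]


def non_null_counts(rows, columns):
    """Number of present values per column (the 'nullity by column' view)."""
    matrix = nullity_matrix(rows, columns)
    return {col: sum(r[i] for r in matrix) for i, col in enumerate(columns)}


columns = ["x", "y"]
rows = [
    {"x": 1.0, "y": 2.0},
    {"x": None, "y": 3.0},
    {"x": 4.0, "y": float("nan")},
    {"x": 5.0},  # "y" absent altogether
]

matrix = nullity_matrix(rows, columns)
counts = non_null_counts(rows, columns)

# Text rendering of the matrix: '#' for present, '.' for missing.
for present in matrix:
    print("".join("#" if p else "." for p in present))
print(counts)
```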
First rows
| | Datetime | SYM/H_INDEX_nT | 1-M_AE_nT | 400kmDensity | DAILY_SUNSPOT_NO_ | DAILY_F10.7_ | 3-H_KP*10_ | irradiance (W/m^2/nm) | d_diff |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-10-15 02:25:00 | -12.0 | 178.0 | 4.789873e-12 | 148.0 | 136.9 | 33.0 | 0.006212 | -1.369510e-13 |
| 1 | 2006-11-01 00:43:00 | -21.0 | 154.0 | 9.967436e-13 | 53.0 | 85.3 | 17.0 | 0.005186 | -1.766994e-13 |
| 2 | 2004-08-29 19:58:00 | 20.0 | 54.0 | 7.392175e-13 | 27.0 | 87.7 | 13.0 | 0.005606 | 3.271490e-14 |
| 3 | 2009-06-24 18:53:00 | 42.0 | 74.0 | 4.133310e-13 | 17.0 | 69.1 | 50.0 | 0.004944 | 2.687410e-14 |
| 4 | 2011-11-07 11:11:00 | 4.0 | 26.0 | 7.050772e-12 | 173.0 | 178.9 | 7.0 | 0.006449 | -4.988700e-14 |
| 5 | 2003-11-21 03:36:00 | -186.0 | 240.0 | 7.717244e-12 | 119.0 | 172.8 | 60.0 | 0.006344 | 1.844890e-13 |
| 6 | 2008-04-19 03:57:00 | 2.0 | 112.0 | 8.577820e-13 | 10.0 | 71.7 | 17.0 | 0.004954 | 4.133510e-14 |
| 7 | 2004-01-31 04:46:00 | 1.0 | 101.0 | 1.303272e-12 | 62.0 | 91.6 | 17.0 | 0.005664 | 4.805200e-14 |
| 8 | 2005-11-02 01:42:00 | -20.0 | 104.0 | 1.149596e-12 | 29.0 | 76.8 | 17.0 | 0.005200 | 1.030850e-13 |
| 9 | 2003-10-13 00:41:00 | 3.0 | 206.0 | 1.121241e-12 | 19.0 | 94.0 | 33.0 | 0.006050 | -2.467200e-14 |
Last rows
| | Datetime | SYM/H_INDEX_nT | 1-M_AE_nT | 400kmDensity | DAILY_SUNSPOT_NO_ | DAILY_F10.7_ | 3-H_KP*10_ | irradiance (W/m^2/nm) | d_diff |
|---|---|---|---|---|---|---|---|---|---|
| 99990 | 2007-09-05 07:59:00 | -17.0 | 941.0 | 5.951310e-13 | 13.0 | 68.8 | 33.0 | 0.004962 | -3.458970e-14 |
| 99991 | 2007-10-31 00:13:00 | -18.0 | 129.0 | 5.022695e-13 | 0.0 | 66.1 | 13.0 | 0.004944 | 2.386270e-14 |
| 99992 | 2009-09-02 23:03:00 | -8.0 | 18.0 | 2.494270e-13 | 0.0 | 69.4 | 10.0 | 0.004985 | 2.276030e-14 |
| 99993 | 2004-07-08 13:23:00 | 6.0 | 57.0 | 6.032012e-13 | 18.0 | 84.6 | 7.0 | 0.005514 | -3.677660e-14 |
| 99994 | 2007-12-18 15:18:00 | -31.0 | 337.0 | 5.042662e-13 | 12.0 | 74.4 | 33.0 | 0.004947 | -4.261840e-14 |
| 99995 | 2009-06-06 21:53:00 | -6.0 | 82.0 | 5.747234e-13 | 0.0 | 71.1 | 7.0 | 0.005003 | -3.026780e-14 |
| 99996 | 2011-04-20 06:36:00 | -18.0 | 66.0 | 3.751719e-12 | 73.0 | 118.1 | 37.0 | 0.005592 | 3.491500e-14 |
| 99997 | 2004-07-04 04:16:00 | -10.0 | 325.0 | 5.614132e-13 | 31.0 | 82.1 | 13.0 | 0.005612 | -2.081170e-13 |
| 99998 | 2003-12-18 06:06:00 | -14.0 | 139.0 | 8.519602e-13 | 110.0 | 119.1 | 10.0 | 0.005926 | -9.095700e-15 |
| 99999 | 2008-11-28 07:47:00 | -3.0 | 23.0 | 4.736454e-13 | 0.0 | 65.2 | 7.0 | 0.004942 | 2.440020e-14 |